Regional Pattern Discovery in Geo-referenced Datasets Using PCA

Authors

  • Oner Ulvi Celepcikay
  • Christoph F. Eick
  • Carlos Ordonez
Abstract

Existing data mining techniques mostly focus on finding global patterns and lack the ability to systematically discover regional patterns. Most relationships in spatial datasets are regional; therefore, there is a great need to extract regional knowledge from spatial datasets. This paper proposes a novel framework to discover interesting regions characterized by “strong regional correlation relationships” between attributes, and methods to analyze differences and similarities between regions. The framework employs a two-phase approach: it first discovers regions by employing clustering algorithms that maximize a PCA-based fitness function, and then applies post-processing techniques to explain underlying regional structures and correlation patterns. Additionally, a new similarity measure that assesses the structural similarity of regions based on correlation sets is introduced. We evaluate our framework in a case study that centers on finding correlations between arsenic pollution and other factors in water wells, and demonstrate that our framework effectively identifies regional correlation patterns.
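
The abstract does not spell out the PCA-based fitness function, so the following is only a minimal sketch of one possible reading, not the authors' formulation. It assumes each candidate region is a NumPy array of attribute values (rows are wells, columns are attributes), scores a region by the share of variance captured by its leading principal components, and sums size-weighted scores over all regions. The names `region_pca`, `region_interestingness`, `clustering_fitness`, and the parameters `n_components` and `beta` are hypothetical.

```python
import numpy as np


def region_pca(X):
    """Return eigenvalues (descending) and eigenvectors of the
    correlation matrix of X (rows = objects, columns = attributes)."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]        # sort largest first
    return eigvals[order], eigvecs[:, order]


def region_interestingness(X, n_components=1):
    """Hypothetical score (an assumption, not taken from the paper):
    fraction of total variance explained by the leading components.
    Values near 1 indicate strongly correlated attributes."""
    eigvals, _ = region_pca(X)
    return float(eigvals[:n_components].sum() / eigvals.sum())


def clustering_fitness(regions, beta=1.0):
    """Hypothetical additive fitness of a clustering: each region
    contributes its score weighted by (number of objects) ** beta."""
    return sum(region_interestingness(X) * len(X) ** beta for X in regions)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    # Region with three strongly correlated attributes.
    correlated = np.column_stack([
        x,
        2 * x + 0.05 * rng.normal(size=200),
        -x + 0.05 * rng.normal(size=200),
    ])
    # Region with three independent attributes.
    independent = rng.normal(size=(200, 3))
    print(region_interestingness(correlated))
    print(region_interestingness(independent))
    print(clustering_fitness([correlated, independent]))
```

Under these assumptions, a clustering algorithm would prefer partitions whose regions show strong internal correlation, and the eigenvectors returned by `region_pca` could serve as a starting point for the post-processing step that explains each region's correlation pattern.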


Similar resources

Spatial dynamics for relative contribution of cropping pattern analysis on environment by integrating remote sensing and GIS

Agricultural resources are considered to be among the most important renewable and dynamic natural resources. Agricultural sustainability has the highest priority in all countries, whether developed or developing. Cropping system analysis is indispensable for grinding the sustainability of agricultural science. Crop alternation is stated as growing one crop after another on the same piece of la...


Data Mining Techniques for Autonomous Exploration of Large Volumes of Geo-referenced Crime Data

We incorporate two knowledge discovery techniques, clustering and association-rule mining, into a fruitful exploratory tool for the discovery of spatio-temporal patterns. This tool is an autonomous pattern detector to reveal plausible cause-effect associations between layers of point and area data. We present two methods for this exploratory analysis and we detail algorithms to effectively expl...


Mining Geo-Referenced Databases: A Way to Improve Decision-Making

Knowledge discovery in databases is a process that aims at the discovery of associations within data sets. The analysis of geo-referenced data demands a particular approach in this process. This chapter presents a new approach to the process of knowledge discovery, in which qualitative geographic identifiers give the positional aspects of geographic data. Those identifiers are manipulated using...


A Unifying Framework for Clustering with Plug-In Fitness Functions and Region Discovery

The goal of spatial data mining [SPH05] is to automate the extraction of interesting and useful patterns that are not explicitly represented in spatial datasets. Of particular interest to scientists are techniques capable of finding scientifically meaningful regions in spatial datasets, as they have many immediate applications in medicine, geosciences, and environmental sciences, e.g., identifi...


SAR Simulation Based Change Detection with High-Resolution SAR Images in Urban Environments

Combined processing using different sensor types, e.g. for applications like change detection, requires good geo-referencing. Furthermore, the individual sensor properties have to be taken into account. SAR systems are side-looking and run-time systems. They suffer from occlusions and ambiguities, especially in urban areas. Additionally, layover and shadow effects disturb the geo-referencing of ...




Publication date: 2009